ABSTRACT

The  Objective  Force  requirements  of  responsiveness, 
agility  and  versatility  call  for  digitized  graphical  decision 
support  interfaces  that  automate  or  otherwise  help  in 
various  reasoning  tasks,  including  reasoning  with  visual 
and  diagrammatic  representations  that  are  ubiquitous  in 
Army  situation  understanding,  planning  and  plan 
monitoring.  In  earlier  papers,  we  described  a 
diagrammatic  reasoning  architecture,  and  demonstrated 
the  approach  for  instances  of  maneuver  recognition  and 
information  fusion  for  entity-reidentification  problems. 
The  current  paper  characterizes  the  computational 
properties  of  the  core  perception  and  path-finding 
algorithms  that  are  an  important  part  of  the  technology’s 
application  for  a  class  of  fusion  problems.  This  analysis  is 
important  since  the  practicality  of  automating 
diagrammatic  reasoning  for  Army  applications  depends 
on  developing  algorithms  with  manageable  complexity. 

1.  INTRODUCTION 

Reasoning  with  visual  representations  consisting  of 
terrain  maps  with  an  overlay  of  diagrammatic  elements  is 
ubiquitous  in  Army  situation  understanding,  planning  and 
plan  monitoring.  Diagrams  are  overlaid  on  top  of  terrain 
maps,  and  they  represent  information  using  a  combination 
of  iconic  and  spatially  veridical  elements.  The  overall 
problem  solving  process  is  a  sequence  of  steps  each  of 
which  is  one  of  three  types:  perception  on  the  diagram,  in 
which  information  about  the  spatial  properties  of  or 
relations  between  diagrammatic  objects  is  obtained; 
inference,  making  use  of  currently  available  symbolic 
information  including  information  obtained  by 
perception;  and  actions  on  the  diagram,  in  which 
diagrammatic  elements  are  added,  deleted  or  modified  to 
satisfy  certain  constraints,  such  as  “find  a  path  that  goes 
from  point  A  to  point  B,  while  avoiding  region  C.” 
Automating  or  semi-automating  such  reasoning  tasks  is 
essential  if  the  ambitious  goals  of  Army  Transformation 
based  on  information  dominance  are  to  be  achieved. 

We  have  been  experimenting  with  an  architecture 
[Chandrasekaran,  et  al,  2002;  Chandrasekaran,  et  al, 
2004]  for  automated  reasoning  with  diagrams  and  applied 
it  to  example  problems  in  maneuver  recognition  and 
information  fusion  for  entity  reidentification.  In  this 
paper, we present the algorithms and their computational
complexities for the set of perceptual and action routines
that we have found useful in spatial reasoning tasks
involved in certain types of information fusion. This
analysis is important since the practicality of automating
diagrammatic reasoning for Army applications depends
on developing algorithms with manageable complexity.

Fig 1. A diagram in ASAS.

2.  DIAGRAMS  IN  A  FUSION  EXAMPLE 

The  entity  reidentification  problem  arises  in  systems 
such  as  U.S.  Army’s  All-Source  Analysis  System 
(ASAS).  The  prototypical  task  can  be  characterized  as 
follows.  A  report  is  received  about  the  sighting  of  an 
entity  of  interest,  along  with  the  time,  location  and  partial 
identity  information,  such  as  that  it  was  a  tank,  or  tank  of 
a  given  type,  etc.  The  task  is  to  decide  if  the  newly 
sighted  object  is  one  of  the  objects  in  the  database, 
sighted  and  identified  earlier,  or  a  new  object.  The  overall 
reasoning  process  is  modeled  as  abductive  inference 
[Josephson  &  Josephson,  1996].  The  reasoning  system 
has  a  number  of  diagrammatic  subtasks.  Fig  1  illustrates 
some  of  them.  The  three  regions  are  marked  as  no-go 
areas  for  the  vehicle  type  of  interest.  The  newly  sighted 
vehicle  is  the  small  circle  at  the  bottom  right  of  the  figure, 
and  the  database  has  identified  two  previously  sighted  and 
identified  vehicles,  the  two  small  circles  at  bottom  left 
and  top  of  the  figure,  as  being  potentially  the  same  as  the 
newly  sighted  vehicle.  The  problem  solver  asked  the 
diagrammatic  reasoner  to  identify  a  possible  path  from  the 
new  sighting  location  to  the  vehicle  at  the  top.  The  action 
component  of  the  diagrammatic  reasoner  found  a  path 
between the two no-go regions. The two elliptical regions
marked Sensor1 and Sensor2 are known sensor fields.
The problem solver wants to know if the path intersects
any of the sensor fields. The diagrammatic reasoner
replies, as we would, that the path intersects Sensor1, but
not Sensor2. Then the problem solver asks (not shown in
Fig 1) if the path could be modified so as to avoid the
field Sensor1 but still go between the two no-go regions.
The  relevant  path  modification  algorithm  finds  that  this  is 
not  possible.  Another  common  example  of  perception  is 
that  of  emergent  objects.  For  instance,  when  two  curves 
intersect,  a  new  point  object,  the  intersection  point,  is 
created.  This  may  be  of  significance  in  some  domain,  e.g., 
that  point  might  be  an  intersection  between  two  routes, 
providing  alternatives  for  path  planning.  Posing  a 
sequence  of  such  questions  to,  and  making  use  of  the 
answers  from,  the  diagrammatic  component,  the  problem 
solver  eventually  decides  that  the  new  sighting  could  not 
correspond  to  the  one  at  the  top. 

3.  DIAGRAMMATIC  REPRESENTATION 

It  is  important  to  distinguish  between  a  diagram  and  a 
general  image.  A  diagram  is  first  and  foremost  a 
representation,  i.e.,  it  is  not  an  image  of  a  natural  object, 
but  one  whose  objects  are  intended  to  represent 
information  to  support  some  reasoning  task.  A  diagram 
differs  from  linguistic-symbolic  representations  in  that  the 
spatial  properties  of  and  relations  between  diagrammatic 
objects  may  be  used  to  represent  information  in  the 
domain. 

A Diagram is a pair (I, DDS) where I is the image,
defined  as  a  specification  -  implicit  or  explicit  -  of 
intensity  values  for  points  in  the  relevant  regions  of  2D 
space,  and  DDS  is  the  Diagram  Data  Structure,  which  is 
a  list  of  labels  for  the  diagrammatic  objects  in  the  image; 
associated  with  each  object  label  is  a  specification  of  the 
subset  of  I  that  corresponds  to  the  object.  A 
diagrammatic  object  can  be  one  of  three  types:  point, 
curve,  and  region.  Point  objects  only  have  location  (no 
spatial  extent),  curve  objects  only  have  axial  specification 
(no  thickness),  and  region  objects  have  location  and 
spatial  extent.  The  labels  are  internal  to  DDS.  External 
labels  such  as  A,  B,  etc.  in  Fig  2  are  additional  features  of 
the  objects  that  may  be  associated  with  them.  I  is  any 
description  from  which  a  specification  of  intensity  values 
for  the  relevant  points  in  the  2D  space  can  be  obtained.  In 
our  work  we  represent  curve  objects  by  a  sequence  of 
points  or  line  segments,  and  regions  by  the  perimeter 
closed  curves,  also  as  line  segments. 

Diagrams  are  constructed  by  placing  diagrammatic 
objects  on  a  2D  surface  in  specific  configurations  to 
represent  specific  information.  DDS  will  initially  consist 
of  the  labels  of  objects  so  placed  and  their  spatial 
specifications. 
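
As an illustration of the definition above, the following is a minimal Python sketch of a DDS that stores internally labeled point, curve and region objects together with optional external labels. The class and method names (`PointObj`, `CurveObj`, `RegionObj`, `DDS`, `lookup_external`) are hypothetical and are not taken from our implementation; the sketch only assumes, as stated above, that curves are sequences of points joined by line segments and that regions are given by their closed perimeter.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union

Coord = Tuple[float, float]


@dataclass
class PointObj:
    location: Coord  # location only, no spatial extent


@dataclass
class CurveObj:
    points: List[Coord]  # consecutive points joined by line segments


@dataclass
class RegionObj:
    boundary: List[Coord]  # closed perimeter; last vertex joins the first


DiagramObject = Union[PointObj, CurveObj, RegionObj]


@dataclass
class DDS:
    """Hypothetical Diagram Data Structure: internal label -> spatial spec."""
    objects: Dict[str, DiagramObject] = field(default_factory=dict)
    external_labels: Dict[str, str] = field(default_factory=dict)

    def add(self, label: str, obj: DiagramObject,
            external: Optional[str] = None) -> None:
        self.objects[label] = obj
        if external is not None:
            self.external_labels[label] = external

    def remove(self, label: str) -> None:
        self.objects.pop(label, None)
        self.external_labels.pop(label, None)

    def lookup_external(self, external: str) -> DiagramObject:
        # Resolve an external label such as "A" to its spatial specification.
        for label, ext in self.external_labels.items():
            if ext == external:
                return self.objects[label]
        raise KeyError(external)
```

A question such as "Is A to the left of C?" would first resolve A and C through `lookup_external` and then pass the resulting spatial specifications to the relevant routine.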


A perceptual routine¹ (PR) takes a specified number
of  diagrammatic  objects  in  the  diagram  and  returns  a 
perception,  which  may  be  an  object,  a  property  or  a 
relation.  An  action  routine  (AR)  creates,  deletes  or 
modifies  objects  in  the  diagram  so  as  to  satisfy  given 
constraints,  some  of  which  may  be  perceptual. 

There  are  no  finite  sets  of  these  routines  that  suffice 
for  all  of  diagrammatic  reasoning,  but  they  can  be 
partially  ordered  with  respect  to  complexity  and  domain- 
specificity,  such  that  routines  later  in  the  partial  order 
make  use  of  routines  earlier. 


Fig.  2.  A  DDS  for  a  simple  diagram  composed  of  a  point, 
a  curve  and  a  region.  There  are  other  objects: 
distinguished  points  such  as  end  points,  the  closed  curve 
defining  the  periphery  of  the  region,  etc. 

As  objects  in  DDS  are  created,  deleted  or  modified, 
new  objects  might  emerge,  objects  may  change  their 
spatial  extents,  and  existing  objects  might  be  lost.  PRs 
will  make  these  determinations  and  DDS  will  be  updated 
to  reflect  these  changes.  DDS  mediates  the  interaction  of 
the problem solver with I. For example, given a question
such as “Is A to the left of C?,” the information in DDS is
used to identify the image descriptions for A and C as
arguments for the PR Leftof(X, Y). In principle, the
same  image  I  might  correspond  to  different  DDS’s 
depending  on  which  subsets  are  organized  and  recognized 
as  objects,  such  as  in  the  well-known  example  of  a  figure 
that  might  be  perceived  as  a  wine  glass  or  profiles  of  two 
faces. 


¹ We call them perceptual routines, rather than the more
common and specific visual routines, because our long-term
goal is to extend the notion of the cognitive state to
multiple perceptual modalities, and vision is just one
modality.


4.  PERCEPTUAL  ROUTINES  (PRs) 

PRs  can  be  categorized  into  two  classes,  emergent 
object  recognition,  and  property/relation  extraction.  The 
first  includes  domain-independent  PRs  that  identify  point, 
curve  and  region  objects  that  are  created  or  lost  when  a 
configuration  of  diagrammatic  objects  is  specified  or 
modified.  The  PRs  of  the  second  class  produce  symbolic 
descriptions  belonging  to  one  of  three  kinds:  (i)  specified 
properties  of  specified  objects  (e.g.,  curve  C  has  length  of 
m  units),  (ii)  relations  between  objects  (e.g.,  point  P  is  in 
region R, curve C1 is a segment of curve C2, object O1 is
to the left of object O2, values of the angles made by
intersection of curves C1 and C2), and (iii) symbols that
name  an  object  or  a  configuration  of  objects  as  an 
instance  of  a  class,  such  as  a  triangle  or  a  telephone. 

The  PRs  of  the  second  class  come  in  different 
degrees  of  domain  specificity.  Properties  such  as  length  of 
curve,  area  of  a  region,  and  quantitative  and  qualitative 
(right,  acute,  obtuse,  etc.)  values  of  angles  made  by 
intersections  of  curves  are  very  general,  as  are 
subsumption  relations  between  objects,  such  as  that  curve 
C1 is a segment of curve C2. Relations such as
Insideof(A, B), Touches(A, B), and Leftof(A, B)
are also quite general. PRs that recognize that a curve is a
straight line, a closed curve is a triangle, etc., are useful
for reasoning in Euclidean geometry, along with relations
such as Parallel(Line1, Line2). The PRs of the second
class are open-ended in the sense that increasingly
domain-specific perceptions may be conceived: e.g., an
L-shaped region, or Halfway-between(Point A, Point B).
Our  goal  for  the  current  set  of  PRs  is  what  appears  to  be  a 
useful  general  set,  with  the  option  for  additional  special 
purpose  routines  later  on.  The  following  is  a  list  of  PRs  of 
different  types  that  we  have  currently  identified  and 
implemented  as  being  generally  useful. 

Emergent  Object  Recognition  Routines.  These  PRs 
return  one  or  more  objects  after  creating  or  recognizing 
them.  Examples  of  such  routines  are  finding  intersection- 
points  when  curve  and/or  region  objects  intersect,  region 
when  a  curve  closes  on  itself,  new  regions  when  regions 
intersect,  new  regions  when  a  curve  intersects  with  a 
region,  extracting  distinguished  points  on  a  curve  (such  as 
end  points)  or  in  a  region,  extracting  distinguished 
segments  of  a  curve  (such  as  those  created  when  two 
curves  intersect),  extracting  periphery  of  a  region  as  a 
closed  curve.  Reverse  operations  are  included  -  such  as 
when  a  curve  is  removed,  certain  region  objects  will  no 
longer  exist  and  need  to  be  removed. 

We have implemented the intersection routine using the
sweep-line algorithm [Bentley & Ottmann, 1979], which
takes as input a number of curve and region objects and
computes all the intersection points in O((n+k)*log(n))
time (see Intersect in Table 1). An intersection creates
a number of emergent objects.
For example, when two curves intersect, four sub-curves and
an intersection point emerge. However, computing all the
emergent  curves  and  regions  is  computationally  too 
expensive,  so  we  restrict  ourselves  to  computing  only  the 
emergent  first  order  objects  -  i.e.,  objects  that  do  not  have 
other  emergent  objects  of  the  same  type  as  subparts,  or 
objects  specifically  requested  by  the  problem  solver. 
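
To make the notion of an emergent intersection point concrete, the following Python sketch computes the intersection points of two piecewise-linear curves. It is deliberately a brute-force, all-pairs test (quadratic in the number of segments) and not the sweep-line algorithm cited above; the function names and the tolerance `eps` are assumptions introduced for illustration, and parallel or collinear segment pairs are simply reported as non-intersecting.

```python
def segment_intersection(p1, p2, p3, p4, eps=1e-12):
    """Return the intersection point of segments p1p2 and p3p4, or None.

    Parallel and collinear pairs return None in this simplified sketch.
    """
    d1x, d1y = p2[0] - p1[0], p2[1] - p1[1]
    d2x, d2y = p4[0] - p3[0], p4[1] - p3[1]
    denom = d1x * d2y - d1y * d2x  # cross product of the two directions
    if abs(denom) < eps:
        return None
    # Solve p1 + t*d1 == p3 + u*d2 for the parameters t and u.
    t = ((p3[0] - p1[0]) * d2y - (p3[1] - p1[1]) * d2x) / denom
    u = ((p3[0] - p1[0]) * d1y - (p3[1] - p1[1]) * d1x) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * d1x, p1[1] + t * d1y)
    return None


def curve_intersections(c1, c2):
    """Brute-force intersection points between two curves (lists of points)."""
    points = []
    for a, b in zip(c1, c1[1:]):
        for c, d in zip(c2, c2[1:]):
            p = segment_intersection(a, b, c, d)
            if p is not None:
                points.append(p)
    return points
```

An intersection that falls exactly on a shared vertex may be reported more than once by this sketch; a fuller implementation would deduplicate the result before creating emergent point objects.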

Object  Property  Extraction  Routines.  These 
routines  might  return  a  numerical  value  as  in 
Length(Curve  C),  or  a  boolean  as  in  case  of 
Closed(Curve  C).  Length(Curve  C)  computes  the 
length  of  C  by  computing  the  sum  of  Euclidean  distances 
between  each  consecutive  pair  of  points.  Area(Region  R) 
computes  the  signed  area  of  R,  which  is  assumed  to  be  an 
arbitrary  non-self-intersecting  polygon  with  n  vertices, 
using  the  formula: 

$$\text{Area} = \frac{1}{2}\sum_{i=1}^{n}\left(x_i\,y_{(i \bmod n)+1} - x_{(i \bmod n)+1}\,y_i\right)$$

where (x_i, y_i) is the coordinate of the i-th vertex.

Counter-Clock-Wise(Region  R)  checks  whether  R 
is  oriented  in  counterclockwise  direction  by  checking 
whether the area of R is positive. Angle(Point P1, Point
P2, Point P3) computes the angle between the line
segments P1P2 and P2P3 at point P2.
Straightline(Curve  C)  checks  whether  all  the  points 
on  C  are  collinear  or  not.  Closed(Curve  C)  checks 
whether  C  intersects  itself  by  using  the  Intersect  routine. 
Additional  property  extraction  routines  can  be  added  as 
needed. 
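
The property extraction routines described above can be sketched in a few lines of Python. This is an illustrative sketch, not our implementation: the function names are hypothetical, curves and regions are assumed to be lists of (x, y) tuples, the angle is returned unsigned in radians, and the collinearity tolerance `eps` is an assumption.

```python
import math


def length(curve):
    """Sum of Euclidean distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(curve, curve[1:]))


def area(region):
    """Signed area of a non-self-intersecting polygon (shoelace formula)."""
    n = len(region)
    total = 0.0
    for i in range(n):
        x_i, y_i = region[i]
        x_next, y_next = region[(i + 1) % n]  # wraps around to the first vertex
        total += x_i * y_next - x_next * y_i
    return total / 2.0


def counter_clock_wise(region):
    """A region is counterclockwise when its signed area is positive."""
    return area(region) > 0


def angle(p1, p2, p3):
    """Unsigned angle (radians) at p2 between segments p1p2 and p2p3."""
    v1 = (p1[0] - p2[0], p1[1] - p2[1])
    v2 = (p3[0] - p2[0], p3[1] - p2[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.atan2(cross, dot))


def straight_line(curve, eps=1e-9):
    """True when every point is collinear with the first two points."""
    if len(curve) < 3:
        return True
    (x0, y0), (x1, y1) = curve[0], curve[1]
    return all(abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) <= eps
               for x, y in curve[2:])
```

Each routine makes a single pass over the points of its input, consistent with the O(n) entries for Length, Area, Counter-Clock-Wise and StraightLine in Table 1.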

Relational  Perception  Routines.  These  routines 
return  a  boolean  value  after  checking  whether  one  or 
more objects satisfy a certain relation. Insideof(Point
P, Region R) checks whether P lies inside R or not. In
order to compute whether a given object of any type lies
within a given region or not, we check for every point of
that object using the routine Insideof. Outsideof is
implemented  similarly. 
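
One common way to realize a point-in-region test in O(n) time is ray casting with the even-odd rule, sketched below in Python. The text above does not state which test our implementation uses, so the choice of ray casting is an assumption, as are the function names; points lying exactly on the periphery are not treated specially in this sketch.

```python
def inside_of(point, region):
    """Even-odd ray casting: count crossings of a ray going right from point."""
    x, y = point
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def object_inside_of(points, region):
    """An object lies within a region when every one of its points does."""
    return all(inside_of(p, region) for p in points)


def outside_of(point, region):
    """Complement of inside_of."""
    return not inside_of(point, region)
```

For example, with the square region `[(0, 0), (4, 0), (4, 4), (0, 4)]`, the point `(2, 2)` is reported as inside and `(5, 2)` as outside.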

Leftof(Point P1, Point P2, POV) checks whether P1
is to the left of P2 or not with respect to the given point of
view POV. The point of view specifies the direction
towards which the observer is faced, in terms of an angle
with respect to a fixed horizontal axis. Rightof(Point
P1, Point P2, POV), Above(Point P1, Point P2, POV), and
Below(Point P1, Point P2, POV) can be computed
similarly. Topof(Region R1, Region R2) is required in
domains like the Blocks World where one block might be
on top of another block. In such cases, we would consider
two blocks as regions R1, R2, and infer that R1 is on top of
R2 if R1 and R2 touch each other at more than one point
and R1 is above R2 with respect to the vertical point of
view. 
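
A constant-time directional test of this kind can be sketched with a cross product, as in the Python fragment below. The convention used here is an assumption made for illustration: the point of view is an angle in degrees measured counterclockwise from the positive x-axis, and P1 counts as left of P2 when the displacement from P2 to P1 lies on the left-hand side of the facing direction. The function names are hypothetical.

```python
import math


def left_of(p1, p2, pov_degrees):
    """True when p1 is to the left of p2 for an observer facing pov_degrees."""
    theta = math.radians(pov_degrees)
    dx, dy = math.cos(theta), math.sin(theta)  # facing direction
    # Positive cross product: the displacement p2 -> p1 points to the left.
    cross = dx * (p1[1] - p2[1]) - dy * (p1[0] - p2[0])
    return cross > 0


def right_of(p1, p2, pov_degrees):
    """p1 is right of p2 exactly when p2 is left of p1."""
    return left_of(p2, p1, pov_degrees)
```

With an observer facing "up the page" (90 degrees), `left_of((-1, 0), (0, 0), 90)` holds, while for an observer facing along the positive x-axis (0 degrees) it is the point `(0, 1)` that lies to the left of the origin.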


Table 1. Selected Perceptual Routines and Their Computational Complexities

| Class | PR | Input | Output | Computational complexity | PRs used |
|---|---|---|---|---|---|
| Quantitative PRs | Distance | Point P1, Point P2 | A real number | O(1) | - |
| | Angle | Point P1, Point P2, Point P3 | A real number | O(1) | - |
| | Area | Region R | A real number | O(n); n = # segments in R | |
| | Length | Curve C | A real number | O(n); n = # segments in C | Distance |
| Qualitative PRs | StraightLine | Curve C | A boolean | O(n); n = # segments in C | - |
| | Closed | Curve C | A boolean | O((n+k)*log(n)); n = # segments in C, k = # intersections | Intersect |
| | Counter-Clock-Wise | Region R | A boolean | O(n); n = # segments in R | Area |
| | Leftof | Point P1, Point P2, Point of View | A boolean | O(1) | - |
| | Rightof | Point P1, Point P2, Point of View | A boolean | O(1) | Leftof |
| | Above | Point P1, Point P2, Point of View | A boolean | O(1) | - |
| | Below | Point P1, Point P2, Point of View | A boolean | O(1) | Above |
| | On | Point P, Curve C | A boolean | O(n); n = # segments in C | - |
| | Touches | Object O1, Object O2 | A boolean | O(n1*n2); n1 = # segments in O1, n2 = # segments in O2 | On |
| | Topof | Region R1, Region R2 | A boolean | O(n1*n2); n1 = # segments in R1, n2 = # segments in R2 | Touches, Above |
| | Insideof | Point P, Region R | A boolean | O(n); n = # segments in R | - |
| | Outsideof | Point P, Region R | A boolean | O(n); n = # segments in R | Insideof |
| | Subcurveof | Curve C1, Curve C2 | A boolean | O(n1*n2); n1 = # segments in C1, n2 = # segments in C2 | On |
| | Subregionof | Region R1, Region R2 | A boolean | O(n1*n2); n1 = # segments in R1, n2 = # segments in R2 | Insideof, Intersect |
| | Parallel | Curve C1, Curve C2 | A boolean | O(n); n = # points in C1 or C2 | |
| Object Recognition PRs | ScanPath | Curve C, Diagram D, Relation S | Object O1, Object O2, ..., Object Or | O(n*m); n = # segments in C, m = # objects in D | PR for extracting relation S |
| | Intersect | Curve C1, Curve C2, ..., Curve Cr; Region R1, Region R2, ..., Region Rt | Point P1, ..., Point Ps; Curve C1, ..., Curve Cp; Region R1, ... | O((n+k)*log(n)); n = (sum over i = 1..r of # segments in Ci) + (sum over j = 1..t of # segments in Rj), k = # intersections | - |

Table 2. Selected Action Routines and Their Computational Complexities

| AR | Input | Output | Computational complexity | PRs/ARs used |
|---|---|---|---|---|
| Translate | Object O, Real number t | Object O’ | O(n); n = # points in the input object | - |
| Rotate | Object O, Real numbers (x, y, θ) | Object O’ | O(n); n = # points in the input object | |
| Medial | Polygon, Polygon Vertices | Medial Axis | O(n*log(n)); n = # sampled points in the input polygons | - |
| PathFinder | Polygon, Polygon Vertices, Start Pt, End Pt | O(2^k) Paths, Path Lengths | O(n^2 + k^k); n = # sampled points in the input polygons, k = # polygons (k << n) | Medial |
| Homotopic | Path1, Path2, Point P1, Point P2, ..., Point Pn | A boolean | O(n*p); n = # input points, p = sum over i = 1..2 of # segments in Path i | Insideof |
| ModifyPath_to_avoid_obstacles | Path, Diagram OldD, Diagram NewD | O(2^k) Paths, Path Lengths | O(n^2 + k^k); n = # sampled points in the input polygons, k = # polygons (k << n) | PathFinder, Homotopic |
| ModifyPath_to_pass_through_given_points | Path, Diagram D, Point P1, Point P2, ..., Point Pr | O(2^k) Paths, Path Lengths | O(r*(n^2 + k^k)); n = # sampled points in the input polygons, k = # polygons (k << n), r = # points to pass through | PathFinder, Homotopic |
| ShortenPath | Path, Point P1, Point P2, ..., Point Pn | ShorterPath | O(n*p*r); n = # input points, p = # segments in the input path, r = # iterations desired/required for convergence | - |

On(Point  P,  Curve  C)  checks  whether  P  lies  on  C  or 
not.  It  is  noteworthy  that  the  point  might  not  necessarily 
be  one  of  the  points  describing  the  curve  but  still  lie  on 
the curve by lying on one of its segments. On(Curve C1,
Curve C2) is computed by checking whether each segment
of C1 lies on C2 or not. Subcurveof(Curve C1, Curve
C2) checks whether C1 is a sub-curve of C2 or not. For C1
to be a sub-curve of C2, each segment of C1 must lie on C2
and C1 has to be smaller in length than C2. So we check
whether there exists a segment of C1 that does not lie on
C2  using  PR  On  to  infer  the  result. 
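
The following Python sketch illustrates how On and Subcurveof might be composed. It is a simplified illustration under stated assumptions: the names and the tolerance `eps` are hypothetical, and a segment of C1 is taken to lie on C2 only when both of its endpoints lie on a single segment of C2, so a segment of C1 that spans several collinear segments of C2 is not recognized.

```python
import math


def on_segment(p, a, b, eps=1e-9):
    """True when point p lies on the segment from a to b (within eps)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False  # p is not collinear with a and b
    dot = (p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])
    squared_len = (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2
    return -eps <= dot <= squared_len + eps


def on(point, curve, eps=1e-9):
    """The point need not be a vertex; it may lie on any segment of the curve."""
    return any(on_segment(point, a, b, eps) for a, b in zip(curve, curve[1:]))


def length(curve):
    return sum(math.dist(p, q) for p, q in zip(curve, curve[1:]))


def segment_on_curve(a, b, curve, eps=1e-9):
    """Simplification: both endpoints must lie on one segment of the curve."""
    return any(on_segment(a, s, t, eps) and on_segment(b, s, t, eps)
               for s, t in zip(curve, curve[1:]))


def subcurve_of(c1, c2, eps=1e-9):
    """Every segment of c1 lies on c2, and c1 is strictly shorter than c2."""
    all_on = all(segment_on_curve(a, b, c2, eps) for a, b in zip(c1, c1[1:]))
    return all_on and length(c1) < length(c2)
```

The nested iteration over the segments of both curves gives the O(n1*n2) behavior listed for Subcurveof in Table 1.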

Subregionof(Region R1, Region R2) computes
whether R1 is a sub-region of R2 or not. In order for R1 to
be a sub-region of R2, R1 has to be inside R2. If some
points or segments belonging to the periphery of R1 lie on
the periphery of R2, we still consider R1 to be a sub-region
of R2. Touches(Object O1, Object O2) checks whether
objects O1, O2 touch each other or not. We consider O1
and O2 to touch each other if they have at least one point
in  common  on  the  periphery  but  no  point  of  one  object 
lies  inside  the  other  object.  Subsumption  relations  are 
especially  important  and  useful  to  keep  track  of  as  objects 
emerge  or  vanish. 

Abstractions  of  groups  of  objects  into  higher  level 
objects.  Objects  may  be  clustered  hierarchically  into 
groups,  such  that  different  object  abstractions  emerge  at 
different levels. For example, some events in military and
meteorology domains are characterized by a large
number  of  individual  moving  elements,  either  in  pursuit 
of  an  organized  activity  (as  in  military  operations)  in 
groups  at  different  levels  of  abstraction,  or  subject  to 
underlying  physical  forces  (as  in  weather  phenomena). 
Visualizing  and  reasoning  about  happenings  in  such 
domains  are  often  facilitated  by  abstracting  the  mass  of 
data  into  diagrams  of  group  motions,  and  overlaying  them 
on  diagrams  that  abstract  static  features,  like  terrain,  into 
regions  and  curves.  Constructing  such  diagrams  of 
motions  at  multiple  levels  of  abstraction  calls  for 
generating  multiple  hierarchical  grouping  hypotheses  at 
each  sampled  time  instant,  then  choosing  the  best 
grouping  hypothesis  consistent  across  time  instants,  and 
hence  following  the  groups  to  produce  spatial 
representations  of  the  spatiotemporal  motions.  For  a 
detailed  discussion  and  implementation  of  such  a  high 
level  PR,  the  reader  is  referred  to  [Banerjee,  et  al,  2003; 
Chandrasekaran,  et  al,  2002].  Other  PRs  in  this  class 
might  include  generating  associations  between  objects 
based  on  properties  like  those  of  Gestalt  principles. 

Domain-specificity.  Perceptions  may  be  domain- 
specific  because  they  are  of  interest  only  in  some 
domains,  e.g.,  “an  L-shaped  region.”  They  may  also  be 
domain-specific  in  that  they  combine  pure  spatial 
perception  with  domain-specific,  but  non-spatial, 
knowledge. For example, in a military application, a
curve representing the motion of a unit towards a region
might  be  interpreted  as  an  attack,  but  that  interpretation 
involves  combining  domain-independent  spatial 
perceptions  -  such  as  extending  the  line  of  motion  and 
noting  that  it  intersects  with  the  region  -  with  non-spatial 
domain  knowledge  -  such  as  that  the  curve  represents  the 
motion  of  a  military  unit,  that  the  region’s  identity  is  as  a 
military  target  belonging  to  a  side  that  is  the  enemy  of  the 
unit  that  is  moving,  etc.  In  our  current  implementation,  it 
is  the  task  of  the  problem  solver  to  combine  appropriately 
the  domain-independent  perceptions  with  domain-specific 
knowledge  to  arrive  at  such  conclusions,  but  in 
application-dependent  implementations  of  the 
architecture,  some  of  these  perceptions  might  be  added  to 
the  set  of  PRs. 

5.  ACTION  ROUTINES  (ARs) 

The  problem  solving  process  may  modify  the 
diagram  -  create,  destroy,  or  modify  objects.  Typically, 
the  task  -  the  reverse  of  perception  in  some  sense  - 
involves  creating  the  diagram  such  that  the  shapes  of  the 
objects  in  it  satisfy  a  symbolically  stated  constraint,  such 
as  “add  a  curve  from  point  A  to  point  B  that  goes  midway 
between regions R1 and R2,” and “modify the object O1
such that point P in O1 touches point Q in object O2.”
Again similar to PRs, ARs can vary in generality.
Deleting named objects that exist in the diagram, and
adding objects with given spatial specifications, e.g., Add
point at coordinate, Add curve <equation>, etc., are quite
straightforward.  Our  ARs  include  translation  and  rotation 
of  named  objects  for  specified  translation  and  rotation 
parameters. 
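
Translation and rotation of an object given as a list of points can be sketched as follows in Python. The parameterization is an assumption made for illustration: translation is by a displacement (dx, dy), and rotation is by an angle theta in radians about a center (cx, cy). The function names are hypothetical.

```python
import math


def translate(points, dx, dy):
    """Shift every point of the object by the displacement (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, cx, cy, theta):
    """Rotate every point counterclockwise by theta radians about (cx, cy)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy))
            for x, y in points]
```

Both routines visit each point of the object once, which matches the O(n) entries for Translate and Rotate in Table 2.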

The  military  domain  calls  for  a  special  type  of  action 
routine  that  constructs  paths  satisfying  constraints.  The 
entity-reidentification  example  calls  for  ARs  that  find  one 
or  more  representative  paths  from  point  A  to  point  B, 
such  that  intersections  with  a  given  set  of  objects  are 
avoided.  A  representative  path  is  a  curve  object  and  has 
the  right  qualitative  properties  (e.g.  avoid  specific 
regions),  but  is  a  representative  of  a  class  of  paths  with 
those  qualitative  properties,  members  of  which  may  differ 
in  various  quantitative  dimensions,  such  as  length.  Given 
a  set  of  regions  in  a  boundary  of  interest,  the  medial  axis 
of  the  boundary  considering  the  regions  as  holes  in  it  can 
be computed in O(n*log(n)) time where n is the total
number  of  sampled  points  in  the  boundary  and  the  regions 
[Kirkpatrick,  1979].  It  can  be  easily  shown  that  at  least 
one  path  from  any  homotopy  class  can  be  derived  from 
the  medial  axis.  Since  the  number  of  homotopy  classes  is 
infinite,  we  extract  only  those  paths  from  the  medial  axis 
that  do  not  intersect  themselves,  and  consider  them  as 
representative  paths  (see  Fig  3).  There  will  be  at  most 
O(2^k) representative paths where k is the number of
regions.  Our  Medial  and  PathFinder  ARs  compute 
the  medial  axis  and  the  representative  paths  respectively. 


For  their  computational  complexities,  see  Table  2.  Two 
paths  are  considered  homotopic  if  one  can  be 
continuously  deformed  into  the  other  without  crossing 
any  obstacle.  The  AR  Homotopic  computes  whether 
two  given  paths,  with  the  same  endpoints,  are  homotopic 
to  each  other  or  not  by  checking  whether  there  exists  any 
point  inside  the  region(s)  formed  by  the  paths. 
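
The homotopy test described above can be sketched in Python as follows. The sketch rests on several assumptions that are not spelled out in the text: obstacles are represented by sample points, the two paths share both endpoints and do not cross each other in between (so that together they bound a simple polygon), and point containment is decided by even-odd ray casting. The function names are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Even-odd ray casting test for a point against a closed polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def homotopic(path1, path2, obstacle_points):
    """True when no obstacle point lies in the region enclosed by the paths."""
    if path1[0] != path2[0] or path1[-1] != path2[-1]:
        raise ValueError("paths must share their start and end points")
    # Close the loop: follow path1 forward, then the interior of path2 backward.
    loop = list(path1) + list(path2[-2:0:-1])
    return not any(point_in_polygon(p, loop) for p in obstacle_points)
```

For instance, two paths from (0, 0) to (4, 0) that pass above and below the point (2, 0) are not homotopic when (2, 0) is an obstacle point, but they are homotopic when the only obstacle point is (2, 5), which lies outside the enclosed region.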


Fig 3. Path generation using AR Medial.

ARs can also modify given paths to satisfy certain
constraints; for example, a given path might need to be
modified because of detection of a new obstacle (region
object) along its way, or a given path might be required to
pass through certain given points, and so on. We have
implemented routines that perform certain tasks useful for
the information fusion domain. The AR
ModifyPath_to_avoid_obstacles modifies a
path to avoid newly found obstacles, by extracting all the
representative paths from the new set of obstacles (old set
of obstacles and the newly found obstacles) and outputs
those representative paths which are homotopic to the
given path with respect to the old set of obstacles. The AR
ModifyPath_to_pass_through_given_points
modifies a path to pass through a sequence of points, by
considering each consecutive pair of points in the
sequence as the starting point and the end point and
extracting all representative paths between them, and then
concatenating the paths to end up with at most O(r*2^k)
paths, where k is the number of obstacles and r is the
number of points in the sequence. The AR outputs only
those paths that are homotopic to the given path. Table 2
gives the computational complexities of these routines. It
is noteworthy that these ARs are not primitive ARs; rather,
they are built using primitive ARs such as Medial and
PathFinder.

The set of ARs also includes routines that adjust a
path in a homotopy class to be shorter, longer, etc., in
various ways. One useful AR is shortening a given path.
Finding the shortest path in a given homotopy class is a
very well-defined problem in computational geometry. In
our case, we are not always interested in the shortest path.
Sometimes a shorter version of a path in a given
homotopy class suffices, which saves computational cost;
in other cases we need a smooth path that is close to the
shortest but not the absolute shortest, since the shortest
path might have sharp turns that are difficult to navigate.
The AR ShortenPath shortens a path gradually until the
absolute shortest configuration is reached. After each
iteration, the algorithm produces a path shorter than the
one after the previous iteration.

Extending lines indefinitely in certain directions so
that a PR can decide if the extended line will intersect
with an object of interest is one that we need in our
domain. Other researchers have found specific sets of
ARs that are useful for their tasks, such as the AR in
[Lindsay, 1998], “Make a circle object that passes through
points A, B, and C.” An AR that we have not yet used,
but we think would be especially valuable, is one that
changes a region object into a point object and vice versa
as the resolution level changes in problem solving, such
as a city appearing as a point in a national map, while it
appears as a region in a state map.


Underspecification  of  spatial  properties  of  objects. 
More  generally,  each  of  the  PRs  can  be  reversed  and  a 
corresponding  AR  imagined.  For  example,  corresponding 
to the PR Insideof(R1, R2) is the AR, “Make region R2
such that Insideof(R1, R2) is true,” (assuming region R1
exists); and corresponding to Length(curve C1) is the
AR, “Make curve C1 such that Length(C1) < 5 units.” In
most such instances, the spatial specification of the object
being created or modified is radically under-defined.
Depending on the situation, random choices may be
made, or certain rules about creation of objects can be
followed. However, the problem solver needs to keep
track  of  the  fact  that  the  reasoning  system  is  not 
committed  to  all  the  spatial  specification  details. 

6.  RELATED  WORK 

Ullman [Ullman, 1984] proposed that there exists a fixed
set  of  low-level  elemental  operations,  such  as  shifting  of 
processing  focus,  selection  of  salient  locations,  defining  a 
region  of  interest,  marking  locations  already  visited,  etc. 
that  might  be  efficiently  composed  into  visual  routines, 
such  as  visual  search,  texture  segregation,  contour 
grouping,  and  in  this  manner  extract  an  essentially 
unbounded  variety  of  shape  properties  and  spatial 
relations.  A  closely  related  procedural  approach  was 
proposed  in  [Just  &  Carpenter,  1976],  examining  visual 
tasks  such  as  mental  rotation,  from  a  higher-level 
perspective. 

Hayhoe  [Hayhoe,  2000]  argues  that  vision  can  be 
thought of as the ongoing execution of task-specific
routines which can be composed into extended behavioral
sequences. Based on Newell’s conceptualization of
the brain’s temporal hierarchy [Newell, 1990], Hayhoe
showed  using  an  example  of  autonomous  driving  that  the 
routines  depend  critically  on  the  immediate  behavioral 
context.  For  the  same  domain,  visual  routines  such  as 
traffic  light  detection,  stop  sign  detection,  intersection 
detection, looming detection, vehicle detection, obstacle
detection,  road  detection,  etc.  were  developed  [Salgian  & 
Ballard,  1998].  Rao  and  Ballard  [Rao  &  Ballard,  1995] 
proposed  visual  routines  such  as  object  identification, 
object  location  identification,  looming  detection  — 
composing  these  with  different  parameters  allows 
complex  visual  behaviors  to  be  obtained. 

Spatial  relations  have  been  classified  in  different 
classes  --  topological  relations  (e.g.  disjoint),  direction 
relations  (e.g.  north,  east),  distance  relations  (e.g.  far, 
near),  inclusion  relations  (e.g.  in,  at),  and  fuzzy  relations 
(e.g.  next  to,  close  to)  [Pullar  &  Egenhofer,  1988]. 

Our  notion  of  PRs  is  based  on  a  notion  of 
composable  and  extensible  primitives,  but  more  oriented 
to  the  needs  of  problem  solving  with  diagrams.  These 
routines operate on interpreted images, i.e., in the realm of
what  is  referred  to  as  transformation  (imagery)  processes 
in  [Papadias  &  Kavouras,  1994].  Because  of  our  interest 
in  generic  objects,  aspects  of  our  proposals  are  intended  to 
be  domain-independent  as  much  as  possible. 

7.  CONCLUSIONS 

An  architecture  for  representing  and  reasoning  with 
diagrams,  and  its  applications  to  some  situation 
understanding  and  planning  problems  of  Army  interest 
have  been  described  in  our  earlier  work.  The  focus  of  the 
current  paper  has  been  on  the  family  of  perceptual  and 
diagram  construction  algorithms  -  or,  perception  and 
action  routines  as  we  have  called  them  -  that  have  been 
found  useful  in  these  applications.  We  describe  their 
algorithmic  basis,  and  characterize  their  computational 
complexity  properties.  Research  of  the  type  reported  will 
help  in  building  practical  decision  support  systems  with 
manageable  complexity.  Even  though  they  were 
motivated  by  Army  problems,  we  believe  that  the  routines 
described  are  applicable  to  diagrammatic  reasoning  in 
general.  While  the  set  of  such  routines  is  open-ended,  we 
think  that  the  routines  we  have  described  will  provide  a 
good  portion  of  the  base  set  out  of  which  more  complex 
routines  can  be  built. 